Abstract

Motivated in part by various sequences of graphs growing under random rules (like
internet models), convergent sequences of dense graphs and their limits were introduced by
Borgs, Chayes, Lovász, Sós and Vesztergombi and by Lovász and Szegedy. In this paper
we use this framework to study one of the motivating classes of examples, namely randomly
growing graphs. We prove the (almost sure) convergence of several such randomly growing
graph sequences, and determine their limits. The analysis is not always straightforward: in
some cases the cut distance from a limit object can be directly estimated, in other cases
densities of subgraphs can be shown to converge.

1 Introduction 

Convergent graph sequences and their limits have been studied in connection with internet
models, statistical physics, extremal graph theory, and more. In the context of dense graphs, a rather
complete theory has emerged. One can define a notion of convergence based on the convergence
of densities of subgraphs. An appropriate notion of distance between two graphs, called their cut
distance, can be defined, so that convergent sequences are Cauchy in this metric and vice versa.
The completion of the metric space of graphs relative to this metric can be described, and its
elements, i.e., limit objects for convergent graph sequences, can be characterized in various ways.
To mention one of these, limit objects can be described by 2-variable symmetric measurable
functions $[0,1]^2 \to [0,1]$.

The goal of this paper is to study in this framework one of the motivating classes of examples,
namely randomly growing graphs. Typically, such a sequence of graphs grows by every now 
and then adding a new node, and then creating new edges (between the new node and the old 
ones, or between two old nodes) randomly, from some simple distribution determined by local 
conditions. 

We will prove the (almost sure) convergence of several such randomly growing graph
sequences, and determine their limits. This analysis is not always straightforward: in some cases
the cut distance from a limit object can be directly estimated, in other cases densities of subgraphs
can be shown to converge.

2 Preliminaries 

In this section we summarize those notions and results concerning convergent graph sequences 
and their limits which are relevant for the rest of the paper. 

2.1 Convergent graph sequences 

For two simple graphs $F$ and $G$, $\hom(F, G)$ denotes the number of homomorphisms
(adjacency-preserving maps) from $V(F)$ to $V(G)$. We also consider the homomorphism densities

$$t(F, G) = \frac{\hom(F, G)}{|V(G)|^{|V(F)|}}.$$

(Thus $t(F, G)$ is the probability that a random map $V(F) \to V(G)$ is a homomorphism.)

A sequence $(G_n)$ of graphs is convergent if the sequence $t(F, G_n)$ has a limit for every simple
graph $F$.
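To make the definition concrete, here is a minimal brute-force sketch of $\hom(F, G)$ and $t(F, G)$ for very small graphs. The function names and the edge-list/adjacency-matrix representation are illustrative choices of ours, not notation from the paper, and the enumeration over all $|V(G)|^{|V(F)|}$ maps is only feasible for tiny examples.

```python
from itertools import product

def hom_count(f_edges, k, g_adj):
    """Count homomorphisms from F (nodes 0..k-1, given by an edge list)
    into G (given by a symmetric 0/1 adjacency matrix)."""
    n = len(g_adj)
    count = 0
    for phi in product(range(n), repeat=k):  # all maps V(F) -> V(G)
        if all(g_adj[phi[u]][phi[v]] for u, v in f_edges):
            count += 1
    return count

def hom_density(f_edges, k, g_adj):
    """t(F, G) = hom(F, G) / |V(G)|^|V(F)|."""
    return hom_count(f_edges, k, g_adj) / len(g_adj) ** k

# Target graph G: the triangle K3.
K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
EDGE = [(0, 1)]
TRIANGLE = [(0, 1), (1, 2), (0, 2)]

edge_density_in_K3 = hom_density(EDGE, 2, K3)          # 6 / 9
triangle_density_in_K3 = hom_density(TRIANGLE, 3, K3)  # 6 / 27
```

Here both counts equal 6: an edge maps onto any ordered pair of distinct nodes of $K_3$, and a triangle maps onto any of the 6 bijections.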

Convergent graph sequences have limit objects, which can be represented as measurable
functions [?]. Let $\mathcal{W}$ denote the space of all bounded measurable functions $W : [0,1]^2 \to \mathbb{R}$ such
that $W(x, y) = W(y, x)$ for all $x, y \in [0,1]$. We also define $\mathcal{W}_0 = \{W \in \mathcal{W} : 0 \le W \le 1\}$. For
every simple graph $F$ and $W \in \mathcal{W}$, we define

$$t(F, W) = \int_{[0,1]^{V(F)}} \prod_{ij \in E(F)} W(x_i, x_j)\,dx.$$

Every finite simple graph $G$ can be represented by a function $W_G \in \mathcal{W}_0$: Let $V(G) =
\{1, \dots, n\}$. Split the interval $[0,1]$ into $n$ equal intervals $J_1, \dots, J_n$, and for $x \in J_i$, $y \in J_j$ define

$$W_G(x, y) = \begin{cases} 1, & \text{if } ij \in E(G), \\ 0, & \text{otherwise.} \end{cases}$$

Informally, we replace each entry in the adjacency matrix of $G$ by a square of size $(1/n) \times
(1/n)$, and define the value of the function $W_G$ on this square as the corresponding entry of the
adjacency matrix.
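The construction of $W_G$ can be sketched in a few lines. The helper name `step_function` is ours, and the choice of half-open intervals $J_i = [(i-1)/n, i/n)$ (with the point 1 assigned to the last interval) is an assumption that only affects a set of measure zero.

```python
def step_function(g_adj):
    """Return W_G as a Python function on [0,1]^2, given the adjacency
    matrix of G. Nodes are 0-indexed here, so J_i = [i/n, (i+1)/n)."""
    n = len(g_adj)

    def w(x, y):
        i = min(int(x * n), n - 1)  # index of the interval containing x
        j = min(int(y * n), n - 1)  # index of the interval containing y
        return g_adj[i][j]

    return w

# Path on three nodes: 0 - 1 - 2.
PATH3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
W_PATH3 = step_function(PATH3)
```

For instance, `W_PATH3(0.1, 0.5)` looks up the entry for nodes 0 and 1, which are adjacent, while `W_PATH3(0.1, 0.9)` looks up nodes 0 and 2, which are not.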

Graphons represent limits of convergent graph sequences in the following sense. 

Theorem 2.1 (a) For every convergent graph sequence $(G_n)$ there is a $W \in \mathcal{W}_0$ such that
$t(F, G_n) \to t(F, W)$ for every simple graph $F$.

(b) This function $W$ is uniquely determined up to measure preserving transformations in
the following sense: For every other limit function $W'$ there are measure preserving maps
$\phi, \psi : [0,1] \to [0,1]$ such that $W(\phi(x), \phi(y)) = W'(\psi(x), \psi(y))$.

(c) Every function $W \in \mathcal{W}_0$ arises as the limit of a convergent graph sequence.

Parts (a) and (c) of the theorem were proved in [7], and part (b) in [2]. The proof of (c) in
[7] depends on $W$-random graphs, to be discussed in the next section.

We could consider any probability space $(\Omega, \mathcal{A}, \pi)$ instead of $[0,1]$, with a symmetric
measurable function $W : \Omega \times \Omega \to [0,1]$. These structures are called graphons. The densities $t(F, W)$ in
a graphon could be defined by a similar integral. Considering graphons would not give greater
generality, since we could always replace $(\Omega, \mathcal{A}, \pi)$ by the uniform measure on $[0,1]$. Still, it is
sometimes useful to represent the limit object by other probability spaces, as we shall see.

2.2 Distance of graphs 

The cut-norm introduced in [?] is defined for $W \in \mathcal{W}$ by

$$\|W\|_\square = \sup_{S, T \subseteq [0,1]} \left| \int_{S \times T} W(x, y)\,dx\,dy \right|,$$

where the supremum goes over measurable subsets of $[0,1]$. We define the cut-distance of two
functions in $\mathcal{W}$ by

$$\delta_\square(U, W) = \inf_{\phi} \|U - W^\phi\|_\square, \qquad (2)$$

where the infimum goes over all invertible maps $\phi : [0,1] \to [0,1]$ such that both $\phi$ and its inverse
are measure preserving, and $W^\phi$ is defined by $W^\phi(x, y) = W(\phi(x), \phi(y))$. For two graphs $G$ and
$G'$, this yields a distance

$$\delta_\square(G, G') = \delta_\square(W_G, W_{G'}).$$
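For a step function with $n$ equal steps, the supremum in the cut-norm is attained on sets $S$, $T$ that are unions of steps, because the integral is bilinear in the fractions of each step covered by $S$ and by $T$. Under that observation, the following sketch computes the cut-norm of a small matrix by brute force. The function name is ours, the running time is exponential in $n$, and the sketch covers only the norm, not the infimum over relabelings in the cut-distance.

```python
from itertools import product

def cut_norm_step(matrix):
    """Cut-norm of the step function whose value on the (i, j) square of
    side 1/n is matrix[i][j]. Brute force over all row subsets S."""
    n = len(matrix)
    best = 0.0
    for s in product((0, 1), repeat=n):
        # Column sums restricted to the rows selected by S.
        col = [sum(matrix[i][j] for i in range(n) if s[i]) for j in range(n)]
        # For fixed S, the best T collects all positive (or all negative) columns.
        positive = sum(c for c in col if c > 0)
        negative = -sum(c for c in col if c < 0)
        best = max(best, positive, negative)
    return best / n ** 2

# W_G - 1/2 for the single edge K2, and the constant function 1.
K2_MINUS_HALF = [[-0.5, 0.5], [0.5, -0.5]]
ALL_ONES = [[1, 1], [1, 1]]
```

For `K2_MINUS_HALF` the best choice is a single row and a single column, giving $0.5 / 4 = 0.125$, whereas the $L_1$-norm of the same function is $0.5$.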

Remark 2.2 (a) We call this a "distance" rather than a "metric" since two different graphs can 
have distance 0. This is the case when one graph can be obtained from the other by replacing 
each node by the same number of twins, or more generally, when both can be obtained from 
a third graph this way. To get a metric, we should identify such pairs of graphs. Similarly, 
to get a metric on $\mathcal{W}_0$, we have to identify functions $U, W$ for which $\delta_\square(U, W) = 0$. Several
characterizations of such pairs are given in [2].

(b) There are combinatorial, but somewhat lengthy ways to define this distance between
graphs; see [3].

We can define a similar distance function based on other norms. We shall use the $L_1$-norm

$$\|W\|_1 = \int_{[0,1]^2} |W(x, y)|\,dx\,dy,$$

from which we can define the edit distance of two functions in $\mathcal{W}$ by

$$\delta_1(U, W) = \inf_{\phi : [0,1] \to [0,1]} \|U - W^\phi\|_1. \qquad (3)$$

The following characterization of convergent graph sequences was proved in [3] (see [5] for 
other characterizations not used in this paper). 

Theorem 2.3 A sequence of graphs $(G_n)$ is convergent if and only if it is Cauchy in the $\delta_\square$
distance. The sequence $(G_n)$ converges to $W$ if and only if $\delta_\square(W_{G_n}, W) \to 0$. Furthermore,
there is a way to label the nodes of the graphs in the sequence so that $\|W_{G_n} - W\|_\square \to 0$.

If the graphs $G_n$ are labeled so that $\|W_{G_n} - W\|_\square \to 0$, then

$$\sup_{S, T} \left| \int_{S \times T} (W_{G_n} - W) \right| \to 0 \qquad (n \to \infty).$$

In particular, it follows that

$$\int_{S \times T} (W_{G_n} - W) \to 0 \qquad (4)$$

for every product set $S \times T$, which implies that $W_{G_n} \to W$ in the weak* topology of $L_\infty([0,1]^2)$.
Convergence in the norm $\|\cdot\|_\square$ is, however, not equivalent to convergence in this weak* topology,
as the sequence of prefix attachment graphs shows (Section 3.3).

2.3 W-random graphs and extensions

Let $(\Omega, \mathcal{A}, \pi, W)$ be a graphon. For every finite subset $S \subseteq \Omega$ we define two graphs $\mathbb{G}(S, W)$
and $\mathbb{H}(S, W)$ on $V(\mathbb{G}(S, W)) = V(\mathbb{H}(S, W)) = S$. In $\mathbb{G}(S, W)$, we connect $i, j \in S$, $i \ne j$, with
probability $W(i, j)$, independently for different pairs. In $\mathbb{H}(S, W)$, we connect $i, j \in S$, $i \ne j$, by an edge with weight $W(i, j)$. If $W$
is $\{0, 1\}$-valued, then $\mathbb{G}(S, W) = \mathbb{H}(S, W)$ is deterministic, and can be considered as an "induced
subgraph".

Let $S_n$ be a random $n$-element subset of $\Omega$ (each element of $S_n$ chosen independently from the
distribution $\pi$). The graph $\mathbb{G}(n, W) = \mathbb{G}(S_n, W)$ is called a $W$-random graph. The following fact
was shown in [7] (for the case when the underlying probability space is the uniform distribution
on $[0,1]$, but this is no essential restriction of generality).

Lemma 2.4 With probability 1, the sequence $\mathbb{G}(n, W)$ is convergent and its limit is represented
by the function $W$.
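As an illustration of the definition of a $W$-random graph (not of the proof of the lemma), the following sketch samples $\mathbb{G}(n, W)$ on $[0,1]$ with the uniform measure and compares the edge density $t(K_2, \mathbb{G}(n, W))$ with $t(K_2, W) = \int W$. The function names and the choice $W(x, y) = xy$, for which $\int W = 1/4$, are ours.

```python
import random

def w_random_graph(n, w, rng):
    """Sample G(n, W): draw n independent uniform points of [0,1] and
    connect each pair independently with probability W(x_i, x_j)."""
    xs = [rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < w(xs[i], xs[j]):
                adj[i][j] = adj[j][i] = 1
    return adj

def edge_density(adj):
    """t(K2, G) = hom(K2, G) / n^2 = (2 * number of edges) / n^2."""
    n = len(adj)
    return sum(map(sum, adj)) / n ** 2

sample = w_random_graph(600, lambda x, y: x * y, random.Random(1))
sample_density = edge_density(sample)  # should be close to 1/4
```

A single sample of moderate size already gives a density near $1/4$; the lemma asserts the much stronger statement that all subgraph densities converge simultaneously with probability 1.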

In this paper, we will also need sequences $S_n$ of subsets of $\Omega$ that are not random, but still
$\mathbb{G}(S_n, W)$ converges to $W$. We prove and use the following sufficient condition for a deterministic
sequence $S_n$. Let $(\Omega, d)$ be a metric space, and $\pi$ a probability measure on the Borel subsets
of $(\Omega, d)$. For every $n \ge 1$, let $S_n$ be a finite subset of $\Omega$ such that $|S_n| \to \infty$. We say that the
sequence $(S_n)$ is well distributed in a set $X \subseteq \Omega$ if $|S_n \cap X| / |S_n| \to \pi(X)$ as $n \to \infty$. We say
that $(S_n)$ is well distributed in $(\Omega, d, \pi)$ if for every $\varepsilon > 0$ there exists a partition $\{P_1, \dots, P_m\}$
of $\Omega$ into sets with diameter at most $\varepsilon$ such that $S_n$ is well distributed in each $P_i$.

Lemma 2.5 Let $(\Omega, d, \pi)$ be a metric space with an atom-free probability measure. Let $W : \Omega \times
\Omega \to [0,1]$ be a symmetric measurable function that is almost everywhere continuous. Let $S_n$ be
a sequence of sets that is well distributed in $(\Omega, d, \pi)$.

(a) Then $\delta_1(W_{\mathbb{H}(S_n, W)}, W) \to 0$, and with probability 1, $\delta_\square(W_{\mathbb{G}(S_n, W)}, W) \to 0$.

(b) If $W$ is 0-1 valued, then $\delta_1(W_{\mathbb{G}(S_n, W)}, W) \to 0$.

It is clear that such a conclusion cannot hold without some assumption on W, since a general 
measurable function could be changed on the sets Sn x Sn arbitrarily without changing its 
subgraph densities. 

Proof. (a) First we construct a special partition of $\Omega$.

Claim 2.6 There exists a sequence of partitions $\mathcal{Q}_n$ of $\Omega$ into $|S_n|$ sets such that every partition
class contains exactly one point of $S_n$, the maximum diameter of partition classes tends to 0,
and the maximum of $\bigl|\pi(Q)|S_n| - 1\bigr|$ ($Q \in \mathcal{Q}_n$) tends to 0.

Let $\varepsilon > 0$. Consider a partition $\{P_1, \dots, P_m\}$ into sets with diameter at most $\varepsilon$ such that
$S_n$ is well distributed in every $P_j$. For $n$ large enough, we have $(1 - \varepsilon)\pi(P_j) \le |S_n \cap P_j| / |S_n| \le
(1 + \varepsilon)\pi(P_j)$ for every $j$. Let us partition each set $P_j$ into $|S_n \cap P_j|$ sets of equal measure, each
containing exactly one point of $S_n \cap P_j$, to get the partition $\mathcal{Q}_n$. It is clear that this sequence of
partitions has the properties required in the Claim.

For each $n$ and $s \in S_n$, let $Q_s$ be the partition class of $\mathcal{Q}_n$ containing $s$. Define the function
$W_n$ as follows: for $s, s' \in S_n$ and $(x, y) \in Q_s \times Q_{s'}$, let $W_n(x, y) = W(s, s')$. Then $W_n(x, y) \to
W(x, y)$ in every point $(x, y)$ where $W$ is continuous, in particular $W_n \to W$ almost everywhere.
Since the functions $W_n$ are uniformly bounded, this implies (by dominated convergence) that

$$\|W_n - W\|_1 \to 0 \qquad (n \to \infty). \qquad (5)$$

We can view $W_n$ as $W_{H'_n}$, where $H'_n$ is a weighted graph with $V(H'_n) = S_n$, the weight of
node $s \in S_n$ is $\pi(Q_s)$, and the weight of the edge $ss'$ ($s, s' \in S_n$) is $W(s, s')$. Note that $H'_n$ is almost
the same weighted graph as $H_n = \mathbb{H}(S_n, W)$: they are defined on the same set of nodes, the
edges have the same weights, and the nodeweight $\pi(Q_s)$ is asymptotically $1/|S_n|$ by the Claim.
Given $\varepsilon > 0$, we have $|\pi(Q_s) - 1/|S_n|| < \varepsilon/|S_n|$ if $n$ is large enough. Hence there is a measure
preserving bijection $\phi : [0,1] \to [0,1]$ and a set $R \subseteq [0,1]$ of measure $\varepsilon$ such that

$$W_{H'_n}(x, y) = W^{\phi}_{H_n}(x, y) \qquad (x, y \notin R).$$

This implies that

$$\delta_1(H'_n, H_n) \to 0 \qquad (n \to \infty). \qquad (6)$$

By Lemma 4.3 from [4] it follows that with probability 1,

$$\delta_\square(\mathbb{H}(S_n, W), \mathbb{G}(S_n, W)) \to 0 \qquad (n \to \infty). \qquad (7)$$

Equations (5), (6) and (7) imply that $\mathbb{G}(S_n, W) \to W$ with probability 1.

(b) follows trivially, since in this case $\mathbb{H}(S_n, W) = \mathbb{G}(S_n, W)$.

□



We note that (b) would also follow from the result of Pikhurko [5] that if a graph sequence
tends to a 0-1 valued function $W$ in the $\delta_\square$ distance, then it also tends to $W$ in the $\delta_1$ distance.

2.4 Pixel picture 

We have seen that every finite simple graph $G$ can be represented by a function $W_G \in \mathcal{W}_0$. In
fact, this representation is very useful for creating figures representing graphs.

Every function $W \in \mathcal{W}_0$ can be represented by a grayscale picture on the unit square: the
point $(x, y)$ is black if $W(x, y) = 1$, it is white if $W(x, y) = 0$, and it is appropriately dark grey if
$0 < W(x, y) < 1$. For a graph, this picture gives a black-and-white picture consisting of a finite
number of "pixels". The origin is in the upper left corner (as for a matrix). Figure 1 illustrates
this construction. Note that the function associated with a graph depends on the ordering of
the nodes.

Figure 1: The Petersen graph, its adjacency matrix, and its pixel picture





Example 1 (Half graphs) Consider the half-graphs $H_{n,n}$: they are bipartite graphs on $2n$
nodes $\{1, \dots, n, 1', \dots, n'\}$, where $i$ is connected to $j'$ if and only if $i \le j$. It is easy to see that
this sequence is convergent, and to guess the limit function (Figure 2).






Figure 2: A half-graph, its pixel picture, and the limit function

Example 2 (Erdős-Rényi random graphs) The pixel picture of a random graph is
essentially grey.




Figure 3: A random graph with 100 nodes and with edge density 1/2 

The following simple example illustrates the importance of the ordering of the nodes: 

Example 3 (Chessboard) The 100 x 100 chessboard in Figure 4 is the pixel picture of a
complete bipartite graph. It is also uniformly grey, so one might assume that it represents 
a graph that is close to random. But rearranging the rows and columns so that odd indexed 
columns come first, we see that it is isomorphic to the graph represented by the 2x2 chessboard. 

This example also shows that different graphs may be represented by the same pixel picture: 
all complete bipartite graphs with equal color classes have the same pixel picture. If we restrict 
our attention to graphs with no twin nodes, the pixel picture will determine the graph. 

The pixel picture of a random graph remains uniformly grey, no matter how you reorder the 
nodes. 

It is easy to verify that 

$$t(F, G) = t(F, W_G)$$

for every finite simple graph G. 




Figure 4: A chessboard and the pixel picture obtained by rearranging the rows and columns 






3 Convergent graph sequences and their limits
3.1 Growing uniform attachment graphs 

We generate a randomly growing graph sequence $G_n^{ua}$ as follows. We start with a single node.
At the $n$-th iteration, a new node is born, and then every pair of nonadjacent nodes is connected
with probability $1/n$. We call this graph sequence a randomly grown uniform attachment graph
sequence.
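The growth rule can be simulated directly. In the sketch below (function name ours, nodes labeled $0, \dots, n-1$ in order of birth) a pair $i < j$ stays nonadjacent after $n$ steps with probability $\frac{j}{j+1} \cdots \frac{n-1}{n} = \frac{j}{n}$, so the expected number of edges is $\sum_{i<j} (1 - j/n) = (n^2 - 1)/6$; the simulation only checks that a sample is close to this value.

```python
import random

def uniform_attachment(n, rng):
    """Grow a uniform attachment graph for n iterations.
    At iteration `step` the node step-1 is born, then every nonadjacent
    pair among the existing nodes is connected with probability 1/step."""
    adj = [[0] * n for _ in range(n)]
    for step in range(1, n + 1):
        p = 1.0 / step
        for i in range(step):
            for j in range(i + 1, step):
                if not adj[i][j] and rng.random() < p:
                    adj[i][j] = adj[j][i] = 1
    return adj

N_UA = 150
ua_graph = uniform_attachment(N_UA, random.Random(7))
ua_edges = sum(map(sum, ua_graph)) // 2
ua_expected_edges = (N_UA ** 2 - 1) / 6
```

The cubic running time is acceptable for a few hundred nodes; a faster sampler could instead draw each pair $i < j$ once with probability $1 - j/n$, since the pairs are independent.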




Figure 5: A randomly grown uniform attachment graph with 100 nodes 



Let us do some simple calculations. After $n$ steps, let $\{0, 1, \dots, n - 1\}$ be the nodes (born in
this order). The probability that nodes $i < j$ are not connected is $\frac{j}{j+1} \cdot \frac{j+1}{j+2} \cdots \frac{n-1}{n} = \frac{j}{n}$. These
events are independent for all pairs $(i, j)$. The expected degree of $j$ is

$$\sum_{i=0}^{j-1} \frac{n - j}{n} + \sum_{i=j+1}^{n-1} \frac{n - i}{n} = \frac{n - 1}{2} - \frac{j(j - 1)}{2n}.$$

The expected number of edges is

$$\frac{1}{2} \sum_{j=0}^{n-1} \left( \frac{n - 1}{2} - \frac{j(j - 1)}{2n} \right) = \frac{n^2 - 1}{6}.$$

To figure out the limit function, note that the probability that nodes $i$ and $j$ are connected is
$1 - \max(i, j)/n$. If $i = xn$ and $j = yn$, then this is $1 - \max(x, y)$. This motivates the following:

Theorem 3.1 The sequence $G_n^{ua}$ tends to the limit function $1 - \max(x, y)$ with probability 1.

Proof. For a fixed $n$, the events that nodes $i$ and $j$ are connected are independent for different
pairs $i, j$, and so by the computation above, $G_n^{ua}$ has the same distribution as $\mathbb{G}(S_n, 1 - \max(x, y))$,
where $S_n = \{0, 1/n, \dots, (n - 1)/n\}$. It is easy to see that this sequence is well distributed in the
metric space $[0,1]$ with the uniform measure, and so the Theorem follows by Lemma 2.5.

One can get a good explicit bound on the convergence rate by estimating the cut-distance of
$W_{G_n^{ua}}$ and $1 - \max(x, y)$, using the Chernoff-Hoeffding bound. □

3.2 Growing ranked attachment graphs

This randomly growing graph sequence $G_n^{ra}$ is generated somewhat similarly. We start with
a single node. At the $n$-th iteration, a new node is born, and it is connected to node $i$ with
probability $1 - i/n$. Then every pair of nonadjacent nodes is connected with probability $2/n$.
We call this graph sequence a randomly grown ranked attachment graph sequence.

Theorem 3.2 The sequence $G_n^{ra}$ tends to the limit function $1 - xy$ with probability 1.

Proof. The probability that nodes $i$ and $j$ are not connected after the $n$-th step is



j \ jj \ j + '^J V j{n-l)n 
ij (3n - 3)i3 - "^ni ij 



i2 



jn{n - 1) 



where < qij < mm{^, ij /n^}. Furthermore, these events are independent for different pairs 
$i, j$. Therefore, we can generate the graph $G_n^{ra}$ as follows: We generate $\mathbb{G}(S_n, 1 - xy)$, where
$S_n = \{0, \dots, (n - 1)/n\}$, and then connect each nonadjacent $i$ and $j$ with probability $1 - p_{ij}$.
Since $\mathbb{G}(S_n, 1 - xy)$ tends to the function $1 - xy$ by Lemma 2.5, and the added edges change
$\mathbb{G}(S_n, 1 - xy)$ negligibly in $\delta_\square$ distance, the Theorem follows. □



3.3 Growing prefix attachment graph 

In this construction, it will be more convenient to label the nodes starting with 1. At the $n$-th
iteration, a new node $n$ is born, a node $z$ is selected uniformly at random from $\{1, \dots, n\}$, and node $n$ is connected to nodes
$1, \dots, z - 1$. We denote the $n$-th graph in the sequence by $G_n^{pfx}$, and call this graph sequence a
randomly grown prefix attachment graph sequence (Figure 6).




Figure 6: A randomly grown prefix attachment graph with 100 nodes, and the same graph with 
nodes ordered by their degrees. 

Again we start with some simple calculations. The probability that nodes $i < j$ are connected
is $(j - i)/j$ (but these events are not independent in this case!). The expected degree of $j$ is therefore

$$\sum_{i=1}^{j-1} \frac{j - i}{j} + \sum_{i=j+1}^{n} \frac{i - j}{i} = n - \frac{j}{2} + j \ln \frac{j}{n} + o(n).$$

The expected number of edges is $n(n - 1)/4$.

Looking at the picture, it seems that it tends to some function, which we can try to figure
out similarly as in the case of uniform attachment graphs. The probability that $i$ and $j$ are
connected can be written in a symmetric form as

$$\frac{|i - j|}{\max(i, j)}.$$

If $i = xn$ and $j = yn$, then this is

$$\frac{|x - y|}{\max(x, y)}.$$

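Does this mean that the function $U(x, y) = |x - y| / \max(x, y)$ is the limit? Somewhat
surprisingly, the answer is negative, which we can see by computing triangle densities. The
probability that nodes $i < j < k$ form a triangle is $(1 - \frac{i}{j})(1 - \frac{j}{k})$ (since if $k$ is connected to $j$,
then it is also connected to $i$). Hence the expected number of triangles is

$$\sum_{i<j<k} \left(1 - \frac{i}{j}\right) \left(1 - \frac{j}{k}\right) \sim \frac{1}{6} \binom{n}{3}.$$

Hence

$$\mathsf{E}\bigl(t(K_3, G_n^{pfx})\bigr) \to \frac{1}{6}.$$

On the other hand,

$$t(K_3, U) = \int_{[0,1]^3} \frac{|x - y|}{\max(x, y)} \cdot \frac{|x - z|}{\max(x, z)} \cdot \frac{|y - z|}{\max(y, z)}\,dx\,dy\,dz.$$

Since the integrand is independent of the order of the variables, we can compute this easily:

$$t(K_3, U) = 6 \int_{0 \le x < y < z \le 1} \left(1 - \frac{x}{y}\right) \left(1 - \frac{x}{z}\right) \left(1 - \frac{y}{z}\right) dx\,dy\,dz = \frac{5}{36}.$$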

So $U$ is not the limit of the sequence $G_n^{pfx}$. On the other hand, it is not hard to verify that

$$\int_{S \times T} (W_{G_n^{pfx}} - U) \to 0 \qquad (8)$$

for every measurable $S, T \subseteq [0,1]$. Indeed, it is enough to prove this for sets $S, T$ from a generating set of
the $\sigma$-algebra of Borel sets, e.g. rational intervals. Since there is only a countable number of
these intervals, it suffices to prove that (8) holds with probability 1 for each fixed $S$ and $T$. It is
also easy to see that it suffices to consider the case $S = T$. For a node $j$ with $j/n \in S$, let $X_{n,j}$
denote the number of edges $ij$ ($i < j$) in $G_n^{pfx}$ with $i/n \in S$, and let $X_n = \sum_{j/n \in S} X_{n,j}$. Then
direct computation shows that

$$\frac{1}{n^2} \mathsf{E}(X_n) \to \frac{1}{2} \int_{S \times S} U.$$

Furthermore, the variables $X_{n,j}$ are independent for fixed $n$, hence the Chernoff-Hoeffding
Inequality implies that $\mathsf{P}(|X_n - \mathsf{E}(X_n)| > \varepsilon n^2)$ drops exponentially with $n$. Hence it follows that
$X_n / n^2 \to \frac{1}{2} \int_{S \times S} U$ with probability 1.

So $W_{G_n^{pfx}} \to U$ in the weak-star topology of $L_\infty([0,1]^2)$, but not in our sense. This example
also shows that had we defined convergence of a graph sequence by this convergence in weak-star
topology (after appropriate relabeling), the limit would not be unique.
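The gap between the two triangle densities can be checked numerically. The sketch below (function names ours) evaluates the expected triangle density of the prefix attachment graph exactly from the pair probabilities $(1 - i/j)(1 - j/k)$, using the factorization $\sum_j \bigl(\sum_{i<j} (1 - i/j)\bigr) \bigl(\sum_{k>j} (1 - j/k)\bigr)$, and approximates $t(K_3, U)$ by a midpoint Riemann sum. Both are approximations of the limits $1/6$ and $5/36$, so only rough agreement should be expected.

```python
def expected_triangle_density_pfx(n):
    """6 * E[#triangles] / n^3 for the prefix attachment graph on n nodes."""
    tail = 0.0   # running value of sum_{k=j+1}^{n} 1/k
    total = 0.0
    for j in range(n, 0, -1):
        left = (j - 1) / 2.0           # sum_{i<j} (1 - i/j)
        right = (n - j) - j * tail     # sum_{k>j} (1 - j/k)
        total += left * right
        tail += 1.0 / j
    return 6.0 * total / n ** 3

def u(x, y):
    """U(x, y) = |x - y| / max(x, y)."""
    return abs(x - y) / max(x, y)

def triangle_density_u(m):
    """Midpoint-rule approximation of t(K3, U) on an m x m x m grid."""
    pts = [(a + 0.5) / m for a in range(m)]
    s = 0.0
    for x in pts:
        for y in pts:
            uxy = u(x, y)
            if uxy == 0.0:
                continue
            for z in pts:
                s += uxy * u(x, z) * u(y, z)
    return s / m ** 3

density_pfx = expected_triangle_density_pfx(5000)  # near 1/6
density_u = triangle_density_u(40)                 # near 5/36
```

The two values differ by about $1/36$, which is far larger than the discretization error of either computation.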

Perhaps ordering the nodes by degrees helps? The second pixel picture in Figure 6 suggests
that after this reordering, the functions $W_{G_n^{pfx}}$ converge to some other continuous function. But
again this convergence is only in the weak-star topology, not in the $\delta_\square$ distance. We'll see that
no continuous function can represent the "right" limit: the limit graphon is 0-1 valued, and it
is uniquely determined up to measure preserving transformations by Theorem 2.1, which do not
change this property.

Is this graph sequence convergent at all? Our computation of the triangle densities above
can be extended to computing the density of any subgraph, and it follows that the sequence of
densities $t(F, G_n^{pfx})$ is convergent for every $F$. How to figure out the limit?

Let us label a node born in step $k$, connected to $\{1, \dots, m\}$, by $(k/n, m/k) \in [0,1] \times [0,1]$.
Then we can observe that nodes with labels $(x_1, y_1)$ and $(x_2, y_2)$ are connected if and only if either
$x_1 < x_2 y_2$ or $x_2 < x_1 y_1$.

Consider the function $W^{pfx} : [0,1]^2 \times [0,1]^2 \to [0,1]$, given by

$$W^{pfx}((x_1, y_1), (x_2, y_2)) = \begin{cases} 1, & \text{if } x_1 < x_2 y_2 \text{ or } x_2 < x_1 y_1, \\ 0, & \text{otherwise.} \end{cases}$$

Proposition 3.3 The prefix attachment graphs $G_n^{pfx}$ tend to $W^{pfx}$ with probability 1.



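Proof. Let $S_n$ be the (random) set of points in $[0,1]^2$ of the form $(i/n, z_i/i)$ where $i =
1, \dots, n$ and $z_i$ is a uniformly chosen random integer in $\{1, \dots, i\}$. Then $G_n^{pfx} = \mathbb{G}(S_n, W^{pfx}) =
\mathbb{H}(S_n, W^{pfx})$.

Furthermore, with probability 1, the sets $S_n$ are well distributed in $[0,1]^2$. Indeed, for $m \ge 1$,
let $J_{m,k}$ denote the interval $(k/m, (k + 1)/m]$, and let $\mathcal{P}_m$ denote the partition of $[0,1]^2$ into the
sets $J_{m,k} \times J_{m,l}$ ($k, l = 0, \dots, m - 1$). We want to prove that for every fixed $m$ and $0 \le k, l \le m - 1$,
$|S_n \cap (J_{m,k} \times J_{m,l})| / n \to 1/m^2$ as $n \to \infty$ with probability 1. Let

$$X_i = \begin{cases} 1, & \text{if } (i/n, z_i/i) \in J_{m,k} \times J_{m,l}, \\ 0, & \text{otherwise;} \end{cases}$$

then

$$|S_n \cap (J_{m,k} \times J_{m,l})| = \sum_{i=1}^{n} X_i.$$

We have

$$\mathsf{E}(X_i) = \begin{cases} \dfrac{1}{i} \left( \left\lfloor \dfrac{(l + 1) i}{m} \right\rfloor - \left\lfloor \dfrac{l i}{m} \right\rfloor \right), & \text{if } \dfrac{k}{m} < \dfrac{i}{n} \le \dfrac{k + 1}{m}, \\ 0, & \text{otherwise,} \end{cases}$$

and hence

$$\mathsf{E}|S_n \cap (J_{m,k} \times J_{m,l})| = \sum_{kn/m < i \le (k+1)n/m} \frac{1}{i} \left( \left\lfloor \frac{(l + 1) i}{m} \right\rfloor - \left\lfloor \frac{l i}{m} \right\rfloor \right) = \sum_{kn/m < i \le (k+1)n/m} \frac{1}{m} + O(\log n)$$

$$= \frac{1}{m} \left( \left\lfloor \frac{(k + 1) n}{m} \right\rfloor - \left\lfloor \frac{k n}{m} \right\rfloor \right) + O(\log n) = \frac{n}{m^2} + O(\log n).$$

Thus

$$\mathsf{E}\left( \frac{1}{n} |S_n \cap (J_{m,k} \times J_{m,l})| \right) \to \frac{1}{m^2} \qquad (n \to \infty).$$

The fact that $|S_n \cap (J_{m,k} \times J_{m,l})| / n \to 1/m^2$ with probability 1 (not just in expectation) follows
by the Law of Large Numbers, since the $X_i$ are independent.

Thus Lemma 2.5 applies and proves the Proposition. □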
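The identity $G_n^{pfx} = \mathbb{G}(S_n, W^{pfx})$ used at the start of the proof can be checked mechanically. The following sketch (all function names ours) grows a prefix attachment graph, records the random integers $z_i$, and rebuilds the edge set from the labels $(i/n, z_i/i)$ using the rule defining $W^{pfx}$. Exact rational arithmetic is used so that the strict inequalities are not disturbed by rounding.

```python
import random
from fractions import Fraction

def prefix_attachment(n, rng):
    """Return (z, edges): z[j] is the integer chosen by node j, and
    edges is the set of pairs (i, j), i < j, of the grown graph."""
    z, edges = {}, set()
    for j in range(1, n + 1):
        z[j] = rng.randint(1, j)        # uniform in {1, ..., j}
        for i in range(1, z[j]):        # connect j to nodes 1, ..., z_j - 1
            edges.add((i, j))
    return z, edges

def w_pfx(p, q):
    """W^pfx((x1, y1), (x2, y2)) = 1 iff x1 < x2*y2 or x2 < x1*y1."""
    (x1, y1), (x2, y2) = p, q
    return 1 if (x1 < x2 * y2 or x2 < x1 * y1) else 0

def graph_from_labels(n, z):
    """Edge set of G(S_n, W^pfx) for the labels (i/n, z_i/i)."""
    label = {i: (Fraction(i, n), Fraction(z[i], i)) for i in range(1, n + 1)}
    return {(i, j)
            for i in range(1, n + 1)
            for j in range(i + 1, n + 1)
            if w_pfx(label[i], label[j])}

z_values, grown_edges = prefix_attachment(60, random.Random(3))
rebuilt_edges = graph_from_labels(60, z_values)
```

For $i < j$ the condition $x_i < x_j y_j$ reads $i/n < z_j/n$, that is $i \le z_j - 1$, which is exactly the attachment rule; the second condition can never hold for $i < j$ because $x_i y_i \le x_i < x_j$.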

Lemma 2.5 in fact implies (since $W^{pfx}$ is 0-1 valued) that $W_{G_n^{pfx}}$ tend to $W^{pfx}$ with probability
1 in the edit distance, not just in the cut distance. This means that while the graphs $G_n^{pfx}$ are
random, they are very highly concentrated: two instances of $G_n^{pfx}$ only differ in $o(n^2)$ edges if
overlaid properly (not in the original ordering of the nodes!). Informally, they have a relatively
small amount of randomness in them, which disappears as $n \to \infty$. Indeed, $G_n^{pfx}$ is generated
using only $O(n \log n)$ bits, as opposed to, say, $\mathbb{G}(n, 1/2)$, which is generated using $\binom{n}{2}$ bits. It
would be interesting to explore this phenomenon.

Proposition 3.3 gives a nice and simple representation of the limit object with the underlying
probability space $[0,1]^2$ (with the uniform measure). If we want a representation on $[0,1]$, we
can map $[0,1]$ into $[0,1]^2$ by a measure preserving map $\phi$; then $(W^{pfx})^{\phi}(x, y) = W^{pfx}(\phi(x), \phi(y))$
gives a representation of the same graphon as a 2-variable function. For example, using the map
$\phi$ that separates even and odd bits of $x$, we get the fractal-like picture in Figure 7.

It is interesting to note that the graphs $\mathbb{G}(n, W^{pfx})$ form another (different) sequence of random
graphs tending to the same limit $W^{pfx}$ with probability 1.




Figure 7: The limit of randomly grown prefix attachment graphs (as a function on $[0,1]^2$)



3.4 Preferential attachment graph on n fixed nodes 

A preferential attachment graph with $n$ fixed nodes and $m$ edges, PAG($n$, $m$), is the random graph
obtained by the following procedure. Let $v_1, \dots, v_n$ be a set of nodes. We extend this sequence
one by one by picking an element of the current sequence uniformly at random, and appending
a copy of it at the end. We repeat this until $2m$ further elements have been added. So we get a
sequence $v_1 \dots v_n v_{n+1} \dots v_{n+2m}$.

Now we connect nodes $v_{n+2k-1}$ and $v_{n+2k}$ for $k = 1, 2, \dots, m$, to get PAG($n$, $m$). (Note that
PAG($n$, $m$) may have multiple edges and loops, which we have to live with for the time being.)

Another way of describing this construction is to view it as adding edges one by one, where
the probability of adding an edge connecting $u$ and $v$ is proportional to the product of their
degrees. To be more precise, the probability that the $(k + 1)$-st edge connects $u$ and $v$ is

$$\begin{cases} \dfrac{2 (d_k(u) + 1)(d_k(v) + 1)}{(n + 2k)(n + 2k + 1)}, & \text{if } u \ne v, \\[2ex] \dfrac{(d_k(u) + 1)(d_k(u) + 2)}{(n + 2k)(n + 2k + 1)}, & \text{if } u = v, \end{cases}$$

where $d_k(u)$ is the current degree of the node (adding 1 to the degree is needed to start the
procedure at all; adding 2 to the second factor in the case when $u = v$ is a minor trick that
makes everything come out nicer).
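The sequence-extension procedure translates almost literally into code. The sketch below (function name and return format ours) returns the multigraph as a list of $m$ node pairs, with loops and parallel edges kept. The number of occurrences of a node in the current sequence is its current degree plus one, which is where the factors $d_k(u) + 1$ in the edge probabilities come from.

```python
import random
from collections import Counter

def pag(n, m, rng):
    """Sample PAG(n, m) as a list of m edges (pairs of nodes in 1..n).
    The sequence starts as v_1..v_n = 1..n and is extended 2m times by
    copying a uniformly chosen element of the current sequence."""
    seq = list(range(1, n + 1))
    for _ in range(2 * m):
        seq.append(rng.choice(seq))
    # Edge k joins v_{n+2k-1} and v_{n+2k}; seq is 0-indexed.
    return [(seq[n + 2 * k - 2], seq[n + 2 * k - 1]) for k in range(1, m + 1)]

pag_edges = pag(50, 1000, random.Random(5))
pag_degrees = Counter()
for a, b in pag_edges:
    pag_degrees[a] += 1   # a loop contributes 2 to the degree of its node
    pag_degrees[b] += 1
```

With these parameters the sample corresponds to the setting of Figure 8, PAG(50, 1000); the degrees sum to $2m$ by construction.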




Figure 8: (a) A preferential attachment graph PAG(50, 1000). Darkness of a pixel indicates 
multiplicity of the edge, (b) The same graph with the nodes ordered by decreasing degrees. 

Preferential attachment graphs are motivated by the (sparse) Albert-Barabási graphs [1],
and they have been studied in detail by Pittel [?].

The somewhat awkward definition of preferential attachment graphs is justified by the
following nice properties. First, let us compute the probability that this process yields a multigraph
$G$ on $V(G) = [n]$, with degrees $d_1, \dots, d_n$, with $m$ edges and $m'$ non-loop edges. Fix any order
of the edges, and for the non-loop edges fix an order in which their endpoints were inserted (i.e.,
an orientation of $G$). Then the probability that $G$ arises this way is

$$\frac{d_1! \cdots d_n!}{n(n + 1) \cdots (n + 2m - 1)}. \qquad (9)$$

Summing over all orientations and orderings of the edges, we get that the probability that
PAG($n$, $m$) $= G$ is

$$m!\, 2^{m'} \frac{d_1! \cdots d_n!}{n(n + 1) \cdots (n + 2m - 1)}. \qquad (10)$$

An important observation we can make from this computation is the following: 

Lemma 3.4 Conditioning on the graph PAG($n$, $m$), all the $2^{m'} m!$ possibilities in which the edges
could have been inserted have the same probability.

We can use this lemma to determine the expected subgraph densities in PAG($n$, $m$). For
two multigraphs $F$ and $G$, let inj($F$, $G$) denote the number of embeddings of $F$ into $G$, i.e., the
number of pairs $(\phi, \psi)$ of injective maps $\phi : V(F) \to V(G)$ and $\psi : E(F) \to E(G)$ that preserve
incidence. Let

$$t_{\mathrm{inj}}(F, G) = \frac{\mathrm{inj}(F, G)}{(n)_k},$$

where $k = |V(F)|$ and $n = |V(G)|$.

Let $F$ be a multigraph on $V(F) = [k]$, with degrees $r_1, \dots, r_k$, with $l$ edges and $l'$ non-loop
edges. Fix an order of the edges of $F$ and also an orientation $\sigma$ of the non-loop edges
of $F$ as above. Let $\vec{e}_1, \dots, \vec{e}_m$ be the order and orientation in which PAG($n$, $m$) arises. Let
$p(\sigma, v_1, \dots, v_k, j_1, \dots, j_l)$ denote the probability that edges $\vec{e}_{j_1}, \dots, \vec{e}_{j_l}$ form a copy of $F$ on
nodes $v_1, \dots, v_k$ (with the given labeling of the nodes, the given order of the edges, and the given
orientation). By Lemma 3.4 this number is the same for any $l$-tuple $(j_1, \dots, j_l)$, and trivially,
it is the same for every $k$-tuple $(v_1, \dots, v_k)$. Hence

$$\mathsf{E}(\mathrm{inj}(F, \mathrm{PAG}(n, m))) = \sum_{\sigma} \sum_{v_1, \dots, v_k} \sum_{j_1, \dots, j_l} p(\sigma, v_1, \dots, v_k, j_1, \dots, j_l) = (n)_k (m)_l\, 2^{l'} p(\sigma_0, 1, \dots, k, 1, \dots, l),$$

where $\sigma_0$ is any fixed orientation of $F$. By (9) we have

$$p(\sigma_0, 1, \dots, k, 1, \dots, l) = \frac{r_1! \cdots r_k!}{n(n + 1) \cdots (n + 2l - 1)},$$

and so

$$\mathsf{E}(t_{\mathrm{inj}}(F, \mathrm{PAG}(n, m))) = \frac{(m)_l\, 2^{l'} r_1! \cdots r_k!}{n(n + 1) \cdots (n + 2l - 1)} \sim 2^{l'} r_1! \cdots r_k! \frac{m^l}{n^{2l}}. \qquad (11)$$

Suppose that $n, m \to \infty$ so that $m \sim c n^2 / 2$. Then

$$\mathsf{E}(t_{\mathrm{inj}}(F, \mathrm{PAG}(n, m))) \sim 2^{l'} r_1! \cdots r_k! \frac{m^l}{n^{2l}} \to 2^{l' - l} c^l r_1! \cdots r_k!.$$

If we assume that $F$ has no loops, then

$$\mathsf{E}(t_{\mathrm{inj}}(F, \mathrm{PAG}(n, m))) \to c^l r_1! \cdots r_k!.$$

Using high concentration results, one can show that this convergence holds not only in
expectation, but with probability 1,

$$t_{\mathrm{inj}}(F, \mathrm{PAG}(n, m)) \to c^l r_1! \cdots r_k!.$$

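Note that the relation $t_{\mathrm{inj}}(F, \mathrm{PAG}(n, m)) \sim t(F, \mathrm{PAG}(n, m))$ does not hold in general if $F$
has multiple edges. In fact, it is easy to see that

$$t(F, \mathrm{PAG}(n, m)) \sim \sum_{F'} t_{\mathrm{inj}}(F', \mathrm{PAG}(n, m)) \prod_{ij} \genfrac{\{}{\}}{0pt}{}{m_{ij}}{m'_{ij}},$$

where $F'$ ranges through all multigraphs obtained from $F$ by reducing the edge multiplicities
(not strictly, but keeping at least one copy of each edge), $m_{ij}$ and $m'_{ij}$ denote the multiplicities
of the edge $ij$ in $F$ and $F'$, respectively, and $\genfrac{\{}{\}}{0pt}{}{a}{b}$ denotes the Stirling number of the second kind.

For example, if $K_2^{(2)}$ denotes the graph on two nodes having two parallel edges, then

$$t(K_2^{(2)}, \mathrm{PAG}(n, m)) \sim t_{\mathrm{inj}}(K_2^{(2)}, \mathrm{PAG}(n, m)) + t_{\mathrm{inj}}(K_2, \mathrm{PAG}(n, m)).$$

Let $L_c(x, y) = c (\ln x)(\ln y)$. Then for a multigraph $F$ without loops, with degrees $r_1, \dots, r_k$ and $l$ edges, we have

$$t(F, L_c) = \int_{[0,1]^k} \prod_{ij \in E(F)} c \ln x_i \ln x_j\,dx = c^l \prod_{i=1}^{k} \int_0^1 (\ln x)^{r_i}\,dx = c^l \prod_{i=1}^{k} (-1)^{r_i} r_i! = c^l r_1! \cdots r_k!,$$

where the signs cancel because $r_1 + \dots + r_k = 2l$ is even.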




This implies that the limit of preferential attachment graphs PAG($n$, $cn^2$), with probability
1, is described by the function $L_c$. To be precise, the graphs PAG($n$, $cn^2$) have multiple edges,
and hence the theory of convergent graph sequences developed in [3], [5] does not apply, but the
computations above show that the convergence does occur in at least one possible sense.

Proposition 3.5 If $m(n) = (c + o(1)) n^2$, then with probability 1, $t_{\mathrm{inj}}(F, \mathrm{PAG}(n, m)) \to t(F, L_c)$
for every multigraph $F$ without loops.

Let SPAG($n$, $cn^2$) denote the simplified preferential attachment graph obtained from
PAG($n$, $cn^2$) by deleting loops and keeping only one copy of parallel edges. L. Szakács [10]
proved that this sequence of graphs is convergent with probability 1, and its limit is the function
$1 - \exp(-c \ln x \ln y)$.

4 Convergence to a prescribed function 

Lemma 2.4 gives a way to construct a randomly growing graph sequence converging to a
given function $W$. Let $s_1, s_2, \dots \in \Omega$ be independent random samples from $\pi$, and let
$S_n = \{s_1, \dots, s_n\}$. We can construct $\mathbb{G}(S_n, W)$ by taking $\mathbb{G}(S_{n-1}, W)$, adding $s_n$ as a new
node, and connecting $s_n$ to $s_i$ with probability $W(s_n, s_i)$. Then $\mathbb{G}(S_1, W), \mathbb{G}(S_2, W), \dots$ is a
randomly growing sequence of graphs, and by Lemma 2.4 we have $\mathbb{G}(S_n, W) \to W$ with
probability 1.

However, one can have several objections to this method: First, the new edges are not added
independently of each other. Second, even if $\Omega = [0,1]$, and the function $W$ is, say, continuous
and monotone, the order in which the nodes of $\mathbb{G}(S_n, W)$ are born is random, and has nothing
to do with the order of the points $s_i \in [0,1]$ representing them. In other words, to get a labeling
for which $W_{\mathbb{G}(S_n, W)} \to W$ in the $\|\cdot\|_\square$ norm, we have to reorder the nodes.

It may be interesting to consider rules for generating randomly growing graph sequences
$(G_n)$ with a prescribed limit function $W$ for which these objections cannot be raised. Given a
function $W \in \mathcal{W}_0$, monotone decreasing in each variable, construct a randomly growing simple
graph sequence $(G_1, G_2, \dots)$ as follows. $G_1$ is a single node labeled 1. For $n > 1$, define

$$p_{n,j} = W\left(\frac{j}{n}, 1\right), \qquad p_{n,ij} = \frac{W\left(\frac{i}{n}, \frac{j}{n}\right) - W\left(\frac{i}{n-1}, \frac{j}{n-1}\right)}{1 - W\left(\frac{i}{n-1}, \frac{j}{n-1}\right)}.$$

To get $G_n$ from $G_{n-1}$, we add a new node $n$, connect it to each node $j < n$ with probability
$p_{n,j}$, and connect any two nonadjacent nodes $i, j < n$ with probability $p_{n,ij}$. All these decisions
are independent. The monotonicity of $W$ implies that $0 \le p_{n,ij} \le 1$ is a legal probability.

Proposition 4.1 The sequence of graphs $G_n$ constructed above has the property that $W_{G_n} \to W$
in the $\|\cdot\|_\square$ norm.

Proof. The probability that nodes $i < j$ are not connected in $G_n$ is

$$(1 - p_{j,i})(1 - p_{j+1,ij}) \cdots (1 - p_{n,ij}) = 1 - W\left(\frac{i}{n}, \frac{j}{n}\right),$$

and hence the probability that they are adjacent is $W(\frac{i}{n}, \frac{j}{n})$. Thus $G_n$ is the graph $\mathbb{G}(S_n, W)$,
where $S_n = \{\frac{1}{n}, \frac{2}{n}, \dots, \frac{n}{n}\}$. It is trivial that this sequence of sets is well distributed in $[0,1]$,
and since $W$ is almost everywhere continuous, it follows by Lemma 2.5 that $G_n \to W$ with
probability 1. □
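The telescoping product in the proof can be verified with exact arithmetic. The sketch below assumes the reading $p_{n,j} = W(j/n, 1)$ for the new node and $p_{n,ij} = \bigl(W(\tfrac{i}{n}, \tfrac{j}{n}) - W(\tfrac{i}{n-1}, \tfrac{j}{n-1})\bigr) / \bigl(1 - W(\tfrac{i}{n-1}, \tfrac{j}{n-1})\bigr)$ for old pairs, which is our reconstruction of the construction and is chosen precisely so that the product collapses. The function names and the test functions $1 - xy$ and $1 - \max(x, y)$ are illustrative.

```python
from fractions import Fraction

def p_new(w, n, i):
    """Probability that the new node n is joined to node i < n (assumed form)."""
    return w(Fraction(i, n), Fraction(1))

def p_old(w, n, i, j):
    """Probability that a nonadjacent old pair i, j < n is joined at step n
    (assumed form; requires W(i/(n-1), j/(n-1)) < 1)."""
    prev = w(Fraction(i, n - 1), Fraction(j, n - 1))
    cur = w(Fraction(i, n), Fraction(j, n))
    return (cur - prev) / (1 - prev)

def non_adjacency_probability(w, n, i, j):
    """Probability that i < j <= n are still nonadjacent in G_n."""
    q = 1 - p_new(w, j, i)              # step j: node j is born
    for step in range(j + 1, n + 1):    # later steps: old pair i, j
        q *= 1 - p_old(w, step, i, j)
    return q

def w_ranked(x, y):
    return 1 - x * y

def w_uniform(x, y):
    return 1 - max(x, y)

q_ranked = non_adjacency_probability(w_ranked, 20, 3, 7)
q_uniform = non_adjacency_probability(w_uniform, 20, 3, 7)
```

For $W = 1 - \max(x, y)$ the old-pair probability evaluates to exactly $1/n$, recovering the uniform attachment rule, and for $W = 1 - xy$ it evaluates to $(2n - 1)/n^2$, which agrees with the rate $2/n$ of the ranked attachment rule up to lower order terms.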



The convergent sequences discussed in Sections 3.1 and 3.2 are special cases of this
construction. A more general nice case is when $W = 1 - U$, where $U$ is homogeneous of some degree:
$U(\lambda x, \lambda y) = \lambda^c U(x, y)$ with some $c > 0$. When a new node $n$ is born we connect it to node $i < n$
with probability $W(\frac{i}{n}, 1)$, and then at each further step, we connect any two nonadjacent nodes
with probability $1 - \left(\frac{n-1}{n}\right)^c$.